NOTE


As I was preparing the manuscript of this Report for typing, I discovered,
on April 22, 1960, that my method of approximation, at least when restricted to
polynomials, is not essentially new. A very similar method is described in [6].
For a comparison of the two methods, see § 5.


AN  ALGORITHM  FOR  FINDING  RATIONAL  APPROXIMATIONS  (^) 


H. F. Mattson, Jr.


§1.  ABSTRACT 

Scientific work frequently requires numerical values of functions. Although
much general information is often known about these functions, values of them are
nevertheless often difficult to compute, although in almost all cases some method for
making this computation, however lengthy it may be, is known. The purpose of a
rational approximation to a function is to provide a rapid and convenient way to calculate
numerical values of the function to within a predetermined error. This paper considers
the question of how to find rational approximations to given functions. In § 2 there
appear definitions of terms, a precise statement of what our criterion of best fit is,
and statements of some classical results. In § 3, two closely related iterative methods
for finding best rational approximations are defined. In § 4, a proof of convergence of
these methods is given for a special case (in which both methods are the same). In § 5,
these methods are compared with some others. In § 6, some results obtained by one
of the methods of § 3 are presented, together with a brief description of the computer
program used to obtain them.

§2.  DEFINITIONS  AND  KNOWN  RESULTS 

A few preliminary definitions are necessary to this discussion. We shall restrict
ourselves to the finite interval $I = [a, b]$ on the real line. If we consider the space
of all continuous functions (with real values) defined on $I$, it is natural and commonplace
to define the distance between two such functions $h$ and $g$ as the maximum absolute
value of the difference $h - g$:

$$\max_{a \le x \le b} |h(x) - g(x)| = \|h - g\| .$$

In general, we define $\|G\|$, for any continuous function $G$ defined on $I$, as

$$\|G\| = \max_{a \le x \le b} |G(x)| .$$

Throughout this paper $f$ will denote a fixed, continuous function over $I$. (We
shall impose an additional restriction on $f$ at one point later on.) For given
non-negative integers $m$ and $n$, we consider the family $F_{m,n}$ of all rational functions
of the form $S(x) = P(x)/Q(x)$, where $P(x)$ is a polynomial of degree at most $m$, and
$Q(x)$ is a polynomial of degree at most $n$ having no zeros in $I$. Each $S$ in $F_{m,n}$ is
a certain distance $d_S = \|S - f\| \ge 0$ from $f$. The set of numbers $d_S$, with $S$ in
$F_{m,n}$, has a greatest lower bound $d$, a notation which will be fixed throughout this
paper. The questions on rational approximation which naturally arise are the following:
Is there an $R$ in $F_{m,n}$ such that $d_R = d$? If so, what more can we say about $R$? In
particular, how can we find it?


The answer to the first question is yes: There exists an $R$ in $F_{m,n}$ such that
$d_R \le d_S$ for every $S$ in $F_{m,n}$ [1, p. 53]. For this $R$, then, $d_R = d$. Furthermore,
such $R$ is unique; that is, if $S$ is any rational function in $F_{m,n}$ different from $R$, then
$d_S > d$ [1, p. 56]. $R$ will always denote this best approximation. "Best", or
"best-fitting", is here used in the sense previously defined; it is often called "best in
the sense of Tchebyshev" in the literature.

We now quote two important theorems on rational approximations which will give
more information about $R$ and $d$ (no pun intended). The first theorem will allow
us to find lower bounds for $d$; the second characterizes $R$ in terms of some properties
which will prove to be useful later on.


1)  Numbers  in  square  brackets  refer  to  the  bibliography  at  the  end  of  this  Report. 
Let $R_0$ be any rational function in $F_{m,n}$, so $R_0 = P_0(x)/Q_0(x)$, with
$P_0$ of exact degree $m - \mu$ and $Q_0$ of exact degree $n - \nu$, where $0 \le \mu \le m$ and
$0 \le \nu \le n$. Define $N = m + n + 2 - \delta$, where $\delta = \min(\mu, \nu)$.
Assume also that $R_0(x) \not\equiv 0$ and that $P_0(x)$ and $Q_0(x)$ have no common divisor.

THEOREM A. Suppose that the error function $E_0 = R_0 - f$ at some $N$ points
$x_1 < \ldots < x_N$ in $I$ assumes respectively the values $-v_1, +v_2, -v_3, \ldots, (-1)^N v_N$,
different from zero and of alternating signs (thus all $v$'s have the same sign).
If $R_1$ is any member of $F_{m,n}$ with error function $E_1 = R_1 - f$, then

$$d_1 = \|E_1\| \ge \min \{ |v_1|, \ldots, |v_N| \} .$$

(If $P_0 = 0$, then the same inequality holds with $N = n + 2$.)

A consequence of this theorem is that $d = d_R \ge \min \{ |v_1|, \ldots, |v_N| \}$.
And if $R_0$ is any member of $F_{m,n}$ the error function of which has sufficiently many
extreme values of alternating signs, then $d$ is not less than the minimum of these
extreme values (in magnitude).

THEOREM B. $R_0$ is the best-fitting rational function $R$ if and only if there
exist at least $N$ points $x_1 < \ldots < x_N$ in $I$ at which $E_0 = R_0 - f$ assumes the values
$E_0(x_i) = (-1)^{i+\epsilon} \|E_0\|$, $i = 1, \ldots, N$, where $\epsilon$ is either 0 for all $i$ or 1 for all $i$.

($R = 0$ is the best-fitting rational function in $F_{m,n}$ if, and only if, there are at
least $N = n + 2$ points $x_1 < \ldots < x_N$ for which $f(x_i) = (-1)^{i+\epsilon} \|f\|$.)

For the proof of Theorem A, see [1, pp. 52-53]. A proof of Theorem B also
occurs in [1, pp. 55-57].

In proving the uniqueness of $R$ [1, pp. 56-57], mentioned earlier, one uses
Theorem B.
We shall use the terms "extremum" and "extreme value" as follows: An extremum
of the function $g$ is a point in $I$ at which $g$ takes an extreme value. We
define an admissible set of extrema of $E_0$ ($E_0 = R_0 - f$) to be $N$ points $x_1 < \ldots < x_N$
in $I$ such that

1) each $x_i$ is an extremum of $E_0$, and

2) $E_0(x_i) \ne 0$ and alternates in sign as $i$ increases from 1 to $N$.

The set $\{ E_0(x_i) : i = 1, \ldots, N \}$ is then called an admissible set of extreme values.

Finally, let me observe that the only extrema (of any function $G$) pertinent to
the situation under discussion in this Report are the maxima of $|G|$. Therefore
"extrema" should be tacitly so understood here.


§  3.  DEFINITIONS  OF  ALGORITHMS 

We now define two iterative procedures which in some cases (see § 6) are known
to converge to the best-fitting rational approximation $R$ to our given continuous function
$f$ over $I = [a, b]$. (There are no cases known to me in which the procedures do not
converge to $R$, but a proof of convergence is known to me only for the restriction to
$n = 0$, $f \in C^m$.)

We shall call our first algorithm the "non-linear" algorithm and our second the
"linear" algorithm. Both have in common the following first step:

1°. For a given $m$, $n$ select (say, by interpolation at the estimated zeros of $E$)
$R_0 \in F_{m,n}$ such that $E_0 = R_0 - f$ has an admissible set of extrema.


Nonlinear Method:

2°. Given $R_{j-1} \in F_{m,n}$ such that $E_{j-1} = R_{j-1} - f$ has an admissible set of
extrema $x_1, \ldots, x_N$, determine $R_j \in F_{m,n}$ by imposing the $N$ conditions

$$R_j(x_i) - f(x_i) = (-1)^i y_j, \qquad i = 1, \ldots, N,$$

where $y_j$ is an unknown. The other unknowns
are, of course, the coefficients occurring in $P_j$ and $Q_j$ in $R_j$.

Linear Method:

2°. Given $R_{j-1} = P_{j-1}/Q_{j-1} \in F_{m,n}$ such that $E_{j-1} = R_{j-1} - f$ has an
admissible set of extrema $x_1 < \ldots < x_N$, determine $R_j = P_j/Q_j \in F_{m,n}$ by imposing the $N$
conditions

$$P_j(x_i) - f(x_i)\, Q_j(x_i) = (-1)^i Q_{j-1}(x_i)\, y_j, \qquad i = 1, \ldots, N.$$

Choose the leading coefficient of each $Q_j$ to be 1. Notice that these equations are
linear in the unknowns $y_j$ and the coefficients occurring in $P_j$ and $Q_j$.$^{2)}$
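
One step of the linear method is a single N × N linear solve for the m + 1 coefficients of the numerator, the n non-leading coefficients of the monic denominator, and the level y. The following is a minimal sketch in Python with NumPy, not the program of § 6; the function name `linear_step`, its argument layout, and the sample data are assumptions made for illustration.

```python
import numpy as np

def linear_step(f, xs, q_prev, m, n):
    """One linear-method step: solve
         P(x_i) - f(x_i) Q(x_i) = (-1)^i Q_prev(x_i) y,  i = 1..N,
       for P (degree <= m), monic Q (degree n) and y, with N = m + n + 2.
       Coefficient arrays are ordered from highest power to lowest."""
    xs = np.asarray(xs, dtype=float)
    N = m + n + 2
    assert xs.size == N, "exactly N = m + n + 2 points are required"
    fx = f(xs)
    signs = (-1.0) ** np.arange(1, N + 1)          # (-1)^i for i = 1..N
    cols_p = np.vander(xs, m + 1)                  # x^m, ..., x, 1
    cols_q = -fx[:, None] * np.vander(xs, n)       # non-leading terms of Q, times -f
    col_y = -signs * np.polyval(q_prev, xs)        # known factor Q_prev(x_i)
    A = np.column_stack([cols_p, cols_q, col_y])
    rhs = fx * xs ** n                             # leading term of Q (coefficient 1)
    sol = np.linalg.solve(A, rhs)
    p = sol[: m + 1]
    q = np.concatenate([[1.0], sol[m + 1 : m + 1 + n]])
    return p, q, sol[-1]

# Hypothetical check: f already lies in F_{1,1}, so a single step
# should reproduce it with a level y that is zero up to rounding.
f = lambda x: (x + 2.0) / (x + 3.0)
p, q, y = linear_step(f, [0.0, 1.0, 2.0, 3.0], q_prev=[1.0, 3.0], m=1, n=1)
```

Because the factor multiplying y is evaluated with the previous denominator, nothing in the system is non-linear; the price is that the computed level is only an approximation to the level of the non-linear method until the denominators settle.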


Comments on these methods:

The motivation for the non-linear method clearly comes from the characterization
of the best-fitting rational function in Theorem B. The desired best-fitting rational
function is obviously a fixed point of the transformation defined in Step 2° (modulo
the complications mentioned in footnote 2). The linear method is derived from the
non-linear one in an obvious way, and it also has the desired best-fitting rational function
as a fixed point (modulo the complications mentioned in footnote 2).

It is not known whether either of these methods actually stays inside $F_{m,n}$ in
general. That is, for some $j$, $R_j$ as defined in either method may, a priori, fail to
exist. Also, for some $j$, $R_j$ may, a priori, have a pole in $I$. If one satisfied these
pre-conditions, the central question would remain, namely, whether one or the other
method converges. These three questions appear to be difficult for the general case
(i.e., $m$ and $n$ arbitrary, $f$ any continuous function on $I$) of the non-linear method
and more difficult for that of the linear method. But in the case $n = 0$ both methods
coincide, and a considerable number of the necessary results can be proved for
arbitrary $f$. Finally, if one then restricts $f$ to have $m$ continuous derivatives, a full
existence and convergence proof is possible. We present this proof in the next section.


2) It may happen that $E_{j-1} = R_{j-1} - f$ has more than $N$ extrema. In such a case I require the
choice of a particular admissible set of extrema. I have stated the method in its simplest
form here, for clarity; it appears in full generality as step 2'° in § 4.


§ 4. PROOFS FOR THE CASE n = 0

We now confine ourselves to polynomial approximations $P_j$ to our given $f$.
Both methods are the same in this case. Our first step is to satisfy the
pre-conditions by proving the existence of $P_1$, given a $P_0$ having the required properties. The
proof leads naturally to further results, all of which we include in the following theorem.


THEOREM 1. Let $P_0$ be a polynomial, either 0 or of degree $m - \mu$, where
$0 \le \mu \le m$, such that there are $N = m + 2$ points

$$(a \le)\ x_1 < \ldots < x_N\ (\le b)$$

at which the error function $E_0 = P_0 - f$ takes on respectively the non-zero values
$(-1)^i v_i$, $i = 1, \ldots, N$, where all $v_i$ have the same sign. Then there exists
a polynomial $P_1$ of degree at most $m$, and a number $y$ such that the error function
$E_1 = P_1 - f$ takes the values

$$E_1(x_i) = (-1)^i y, \qquad i = 1, \ldots, N, \qquad (*)$$

and

a) $y$ is uniquely determined by the condition (*),

b) if not all $v_i$ are equal, then

$$\min_{1 \le i \le N} |v_i| < |y| < \max_{1 \le i \le N} |v_i| \quad (\text{therefore } y \ne 0), \qquad (4.1)$$

c) sign $y$ = sign $v_i$.
PROOF. We are given $N$ distinct points at which there hold the equations

$$P_0(x_i) - [f(x_i) + (-1)^i v_i] = 0, \qquad i = 1, \ldots, N.$$

These equations may be thought of as $N$ homogeneous equations, already solved,
for the $N - 1$ coefficients of an $m$-th degree polynomial, plus one more "unknown".
That is, $(0, \ldots, 0, a_0, \ldots, a_{m-\mu}, 1)$, where $a_0, \ldots, a_{m-\mu}$ are the coefficients
of $P_0$, is the solution-vector (transposed), and the $N \times N$ coefficient-matrix is

$$M = \begin{pmatrix} x_1^m & x_1^{m-1} & \cdots & 1 & -[f(x_1) - v_1] \\ \vdots & \vdots & & \vdots & \vdots \\ x_N^m & x_N^{m-1} & \cdots & 1 & -[f(x_N) + (-1)^N v_N] \end{pmatrix} .$$

The existence of $P_0$ with the given properties implies that the determinant of the matrix
$M$ is zero.

What we first wish to prove is that there is a number $y$ such that the determinant of the
matrix we obtain by replacing each $v_i$ by $y$ in $M$ is also zero. (Such a $y$ would imply the
existence of the desired $P_1$.) To this end we expand $\det(M)$ by cofactors of the last column,
obtaining

$$\sum_{i=1}^{N} (-1)^{i+1} e_i\, [f(x_i) + (-1)^i v_i] = 0, \qquad (4.2)$$

where the minor $e_i$ is the Vandermonde determinant

$$e_i = \begin{vmatrix} x_1^m & x_1^{m-1} & \cdots & 1 \\ \vdots & \vdots & & \vdots \\ x_N^m & x_N^{m-1} & \cdots & 1 \end{vmatrix} \quad (\text{row } i \text{ omitted});$$

thus we have, by the well known formula,

$$e_i = \prod_{\substack{j < k \\ j, k \ne i}} (x_j - x_k) .$$

Since the $x_i$'s are all distinct, each $e_i \ne 0$; since the $x_i$'s are arranged in increasing
order, all $e_i$ have the same sign, each one being the product of the same number of
negative factors. Having noted these properties of the $e_i$'s, we now rewrite equation
(4.2) as

$$\sum_{i=1}^{N} (-1)^{i+1} e_i\, f(x_i) = \sum_{i=1}^{N} e_i v_i \equiv e . \qquad (4.3)$$

This shows that $|e| = \sum_i |e_i v_i| > 0$ and that sign $e$ = (sign $e_1$)(sign $v_1$). We can
now immediately satisfy our requirements for the existence of $P_1$ by choosing $y$ to
satisfy

$$e_1 y + \ldots + e_N y = e, \qquad (4.4)$$

or $y = e / \sum_{i=1}^{N} e_i$. Furthermore, this is the only choice open to us for $y$.

A comparison of (4.3) and (4.4) shows that $\min |v_i| < |y| < \max |v_i|$ unless
all $v_i$ are equal (in which case $y = v_1$); and, finally, sign $y$ = sign $e$ · sign $e_1$ =
sign $v_1$. QED.
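
The closed form for the level, $y = \sum_i (-1)^{i+1} e_i f(x_i) / \sum_i e_i$ with $e_i$ the Vandermonde determinant on the points other than $x_i$, can be evaluated directly. The sketch below (Python with NumPy) does so; the function names and the sample points are assumptions chosen for illustration, and computing determinants this way is meant to mirror the proof, not to be an efficient method.

```python
import numpy as np

def vandermonde_minors(xs):
    """e_i = determinant of the Vandermonde matrix (columns x^m, ..., 1)
    built on all points except x_i, for i = 1..N with N = m + 2."""
    xs = np.asarray(xs, dtype=float)
    N = xs.size
    return np.array([np.linalg.det(np.vander(np.delete(xs, i), N - 1))
                     for i in range(N)])

def leveled_deviation(xs, fvals):
    """Level y with P(x_i) - f(x_i) = (-1)^i y, via (4.3) and (4.4)."""
    e_i = vandermonde_minors(xs)
    N = e_i.size
    signs = (-1.0) ** np.arange(2, N + 2)            # (-1)^(i+1), i = 1..N
    e = np.sum(signs * e_i * np.asarray(fvals, dtype=float))
    return e / np.sum(e_i)

# Hypothetical data: four points in [-1, 1], f = exp, so m = 2.
xs = np.array([-1.0, -0.3, 0.4, 1.0])
y_formula = leveled_deviation(xs, np.exp(xs))

# Cross-check by solving the N linear equations for the coefficients and y.
signs = (-1.0) ** np.arange(1, xs.size + 1)          # (-1)^i
A = np.column_stack([np.vander(xs, xs.size - 1), -signs])
y_solve = np.linalg.solve(A, np.exp(xs))[-1]
```

Both routes give the same level; the formula makes visible that $y$ depends only on the points and on the values of $f$ there, not on the previous polynomial.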

There are two points to notice about this theorem. One is that we do not
require the $x_i$'s to be extrema of $E_0$ but only to be points where $E_0$ alternates in sign.
In the application of this theorem to our iterative process, however, we shall take
them as extrema.

The other point is that although $P_0$ may have degree less than $m$, the same is
not necessarily true of $P_1$, as the following example shows:

Let $f(x) = \cos \pi x$, over $I = [a, b] = [0, 5/2]$. Take $m = 2$, so that $N = 4$.
For $P_0$, take $P_0(x) = a_0$, with $0 < a_0 < 1$. Then we have

$$x_1 = 0, \quad x_2 = 1, \quad x_3 = 2, \quad x_4 = 5/2,$$

$$v_1 = 1 - a_0, \quad v_2 = 1 + a_0, \quad v_3 = 1 - a_0, \quad v_4 = a_0,$$

as is obvious from the following sketch:

[Sketch: the graph of $f(x) = \cos \pi x$ on $[0, 5/2]$, the constant $P_0$, and a dashed parabola $P_1$.]

This same sketch makes it obvious that there is no straight line $u = P_1(x)$
having deviations of equal magnitude at the $x_i$'s. Therefore $P_1$ will be a parabola,
something like the one sketched with a dashed line.
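
The example can be checked numerically by solving the four conditions $P_1(x_i) - f(x_i) = (-1)^i y$ for the three coefficients of $P_1$ and for $y$. This is a small illustrative sketch in Python with NumPy; the variable names are assumptions. Note that $a_0$ does not enter: the conditions involve only $f$ and the points $x_i$.

```python
import numpy as np

xs = np.array([0.0, 1.0, 2.0, 2.5])               # x_1, ..., x_4
signs = (-1.0) ** np.arange(1, 5)                 # (-1)^i, i = 1..4

# Unknowns: c2, c1, c0 (P_1 = c2 x^2 + c1 x + c0) and the level y.
A = np.column_stack([np.vander(xs, 3), -signs])
c2, c1, c0, y = np.linalg.solve(A, np.cos(np.pi * xs))

# Solving by hand gives P_1(x) = (4/9)(x - 1)^2 - 2/9 and y = 7/9,
# so the leading coefficient c2 = 4/9 is not zero: P_1 is a parabola.
```

With $a_0 = 1/2$, say, the given deviations are $v = (1/2, 3/2, 1/2, 1/2)$, and $y = 7/9$ lies strictly between their minimum and maximum, as part b) of Theorem 1 requires.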
What we have proved in Theorem 1 is that we can always construct our sequence
$\{P_j\}$ satisfying the conditions of step 2°, provided there exists a first polynomial
$P_0$ satisfying the condition of step 1°. But we can always find $P_0$ by solving
$P_0(x_i) - f(x_i) = (-1)^i y_0$ for the unknown coefficients in $P_0$ and the unknown $y_0$, for any
distinct $x_1, \ldots, x_N \in I$.$^{3)}$ The existence of this $P_0$ and $y_0$ is given by the proof of
Theorem 1, in which $y_0$ is given by (4.4) with $e$ defined by (4.3). It is only
necessary to choose the $x_i$ so that $e \ne 0$, the possibility of which follows, in case $f$ is not
a polynomial of degree at most $m$, from the existence of the best-fitting polynomial
and the consequent existence of $v_i$'s with alternating signs.

Having shown the existence of our sequence $\{P_j\}$, we now show that, under
suitable restrictions, it converges to $P$.

We first must modify our definition of step 2° to take account of the possibility
that there is more than one set of admissible extrema. We prove a simple lemma:

LEMMA 1. If $E_j = P_j - f$ has an admissible set of extrema, then there is an
admissible set of extreme values containing $\pm \|P_j - f\| = \pm d_j$.

PROOF. Let $\pm d_j$ occur as a value of $E_j$ at $x$. If $x$ is already in the admissible
set of extrema, then we are done. If $x$ lies between two of the extrema $x'$ and $x''$,
then $E_j(x)$ must have the same sign as one of $E_j(x')$ and $E_j(x'')$. Replace that one by
$E_j(x)$. If $x$ lies entirely to the left of the admissible extrema, then either replace the
one nearest $x$ by $x$, or delete the farthest one and include $x$, all depending on whether
or not the sign of $E_j$ at the one nearest $x$ is the same as that of $E_j(x)$. The case in which
$x$ lies entirely to the right is treated in the same way. QED.
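
The case analysis in the proof of Lemma 1 translates directly into a small exchange routine. The following Python sketch is illustrative only; the function names and the sample values are assumptions, and the values passed in are taken to be non-zero and alternating in sign.

```python
from bisect import bisect_left

def same_sign(u, v):
    return (u > 0) == (v > 0)

def exchange(points, values, x, value):
    """Bring the point x (where the error takes `value`) into an admissible
    set, keeping N points in increasing order with alternating signs.
    `points` is increasing and `values` holds the error at those points."""
    pts, vals = list(points), list(values)
    if x in pts:                                   # already present
        return pts, vals
    k = bisect_left(pts, x)                        # number of points left of x
    if k == 0:                                     # x lies left of all points
        if same_sign(vals[0], value):
            pts[0], vals[0] = x, value             # replace the nearest one
        else:
            pts, vals = [x] + pts[:-1], [value] + vals[:-1]   # drop the farthest
    elif k == len(pts):                            # x lies right of all points
        if same_sign(vals[-1], value):
            pts[-1], vals[-1] = x, value
        else:
            pts, vals = pts[1:] + [x], vals[1:] + [value]
    else:                                          # x between pts[k-1] and pts[k]
        j = k - 1 if same_sign(vals[k - 1], value) else k
        pts[j], vals[j] = x, value
    return pts, vals

# Hypothetical data: four alternating extreme values and a new, larger one.
pts_mid, vals_mid = exchange([0, 1, 2, 3], [-1.0, 1.0, -1.0, 1.0], 1.5, 2.0)
pts_left, vals_left = exchange([0, 1, 2, 3], [-1.0, 1.0, -1.0, 1.0], -1, 5.0)
```

In each case exactly one old point is given up, so the set keeps its size N and its alternation of signs.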


3) I am indebted to Novodvorskii and Pinsker, [4], via Shenitzer [6], for this
point. It is slightly easier to prove the possibility of this than to show the possibility
of avoiding a tangency of $P_0$ and $f$ when interpolating to $f$ at the estimated zeros of
$E$, which I had suggested earlier. In order to obtain linearity of the equations for $R_1$
when $n > 0$, however, one needs to interpolate as first suggested.
It follows immediately from this lemma that if we choose a set of admissible
extreme values having the largest possible minimum (in magnitude), then we may
replace it by a set containing $\pm \|E_j\|$ without changing the magnitude of the minimum
extreme value in it. It is this procedure that we follow when there is more than one
choice of a set of admissible extreme values at any stage. Accordingly, we substitute
the following step for step 2°:

2'°. Given $P_j \in F_{m,0}$ such that $P_j - f$ has a set of ($N$) admissible extreme values,
choose a set $S$ of admissible extreme values containing the largest possible minimum
(in magnitude) and containing $\pm \|P_j - f\|$. Let $x_1, \ldots, x_N$ be the extrema
corresponding to this set $S$ (i.e., $S = \{ E_j(x_i) : i = 1, \ldots, N \}$). Determine
$P_{j+1}(x) = a_0 x^m + \ldots + a_m$ by imposing the $N$ conditions

$$P_{j+1}(x_i) - f(x_i) = (-1)^i y_{j+1}, \qquad i = 1, \ldots, N,$$

where $y_{j+1}$ is an unknown.

The possibility of carrying out step 2'° has been proved in Theorem 1 and
Lemma 1.

We now turn to the proof of convergence.

LEMMA 2. The sequence $\{ |y_j| \}$ determined by step 2'° is strictly increasing
and bounded above by $d = \|P - f\|$ (unless some $P_j = P$; then all $|y_k| = d$ for $k \ge j$).

PROOF. We are given that $|y_j| = |E_j(x_i)|$, where the $x_i$ are the extrema
belonging to the set $S$ of extreme values of $E_{j-1}$ defined in step 2'°.
Let $\{ (-1)^{i+\epsilon} v_i^* \}$ be a set of extreme values of $E_j$ as prescribed by step 2'°.
Since there is an admissible set of extreme values of $E_j$ such that $|y_j|$ is not larger (in
magnitude) than any of them, it follows that

$$|y_j| \le \min_i |v_i^*| . \qquad (4.5)$$

Now (4.5) plus Theorem 1 yields $|y_j| \le \min_i |v_i^*| < |y_{j+1}|$. And finally,
Theorem A gives $|y_j| \le |y_{j+1}| \le d$.

We now prove a convergence theorem.

THEOREM 2. Every convergent subsequence of $\{P_j\}$ has limit $P$.

PROOF. Every subsequence $\{P_{j_k}\}$, convergent or not, has the property that
the sequence of attached $|y_{j_k}|$ converges to some limit $d' \le d$. Let $P^*$ be the limit
$\lim_{k \to \infty} P_{j_k}$ of our convergent subsequence. Then $P^*$ is a polynomial of degree at most
$m$, (because the sequences of coefficients of the $P_{j_k}$ are bounded; therefore there is a
subsequence of the subsequence $P_{j_k}$ which has a polynomial $p(x)$ as limit. But since
the subsequence $P_{j_k}$ is already convergent, the limit $P^*$ must be the polynomial $p(x)$
just mentioned. Finally, the coefficients of the $P_{j_k}$ are bounded because the polynomials
are bounded at $m + 1$, in fact at all, points of $I$.) Thus $P^*$ is the uniform limit of $\{P_{j_k}\}$.

We shall prove that $d' = \|P^* - f\|$. First, $E^* = P^* - f$ has an admissible set $S$ of
extreme values, since $P^*$ is the uniform limit of $\{P_{j_k}\}$; and, for the same reason,
the set $S$ has $d'$ as minimum magnitude, in view of the inequalities (4.1) of Theorem 1.
Therefore we may, and do, apply step 2'° to $P^*$, obtaining $P^{**}$. If $\|E^*\| > d'$, then
the value of the $y^*$ found in step 2'° (satisfying $E^{**}$(extrema of $E^*$) $= \pm y^*$) would
satisfy $|y^*| > d'$. We choose $k$ large enough so that corresponding admissible extrema
of $E_{j_k}$ and $E^*$ are close enough to each other to yield $|y_{j_k + 1}|$ so close to $|y^*|$ that
$|y_{j_k + 1}| > d'$. This is possible since $y$, determined by formula (4.4), depends
continuously on the extrema. But this result contradicts the definition of $d'$. Therefore
$\|E^*\| = d' \le d$. But since we always have $\|E^*\| \ge d$, we have proved that $d = d'$.

By the uniqueness of $P$, we conclude that $P^* = P$. QED.
Now we are almost finished, for if we knew that $\{P_j\}$ were a bounded sequence,
we could conclude from the previous theorem that $\{P_j\}$ converged to $P$.$^{4)}$ We now
introduce a hypothesis, probably stronger than necessary, which implies the
boundedness of $\{P_j\}$.

THEOREM 3. If $f$ has $m$ continuous derivatives, then the sequence $\{P_j\}$
constructed according to either step 2° or step 2'° is bounded over $I$.

PROOF. Each error function $E_j = P_j - f$ has at least $N - 2 = m$ distinct
extrema interior to $I$. The existence of a continuous first derivative implies that
$E_j'$ has at least $m$ distinct zeros inside $I$. Therefore $E_j'$ has at least $m - 1$ distinct
extrema and $E_j''$ an equal number of zeros inside $I$. Continuing in this way, we find
that $E_j^{(m)}$ has at least one zero inside $I$. Therefore we have $P_j^{(m)}(z') = m!\, a_{0j} =
f^{(m)}(z')$ for some $z'$ in $I$, where $a_{0j}$ is the leading coefficient of $P_j$. Since $f^{(m)}$ is
assumed continuous on $I$, it is bounded there, from which it follows that $\{P_j^{(m)}\}$ is
bounded.

From the relations

$$P_j^{(r-1)}(x) = \int_z^x P_j^{(r)}(t)\, dt + f^{(r-1)}(z), \qquad r = m, m-1, \ldots, 1,$$

where $z$ is a zero of $E_j^{(r-1)}$, we conclude inductively that $\{P_j\}$ is bounded, since
all derivatives of $f$ are bounded, for order not greater than $m$.

Incidentally, we could conclude easily from the above proof that the coefficients
of the $P_j$ are all bounded: $a_{0j}$ is bounded since $P_j^{(m)} = m!\, a_{0j}$ is bounded; one
integration introduces the $a_{1j}$, which are therefore bounded, and so on. But we already
know the boundedness of the coefficients follows from that of the polynomials, in general.

4) We would have: In a complete metric space (here the Euclidean $(m + 1)$-space of
coefficients) a bounded sequence with at most one limit point converges.
We summarize the import of this section in the following theorem.


THEOREM 4. Let $f \in C^m$ on $I$, and let the sequence $\{P_j\}$ of polynomials
be defined by the rules: $P_0 - f$ has an admissible set of extrema; each $P_j$, $j > 0$, is
obtained from $P_{j-1}$ by the rule of step 2'°. Then $\{P_j\}$ converges uniformly on
$I$ to $P$, the polynomial of degree at most $m$ which lies nearest to $f$ in the metric
defined in § 2.
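
The polynomial algorithm can be tried out on a fine grid. The sketch below (Python with NumPy) is a simplified illustration under stated assumptions, not the procedure as proved: extrema are located only at grid points, the admissible set is taken to be N consecutive sign-runs of the error containing the largest one (which agrees with step 2'° when there are exactly N runs), and all function names and the test function are assumptions.

```python
import numpy as np

def leveled_fit(f, xs, m):
    """Solve P(x_i) - f(x_i) = (-1)^i y, i = 1..m+2, for P and y."""
    signs = (-1.0) ** np.arange(1, m + 3)
    A = np.column_stack([np.vander(xs, m + 1), -signs])
    sol = np.linalg.solve(A, f(xs))
    return sol[:-1], sol[-1]                     # coefficients (highest first), y

def admissible_extrema(err, grid, N):
    """Largest |E| in each run of constant sign; keep N consecutive runs
    that contain the overall maximum and have the largest minimum."""
    sign = np.where(err >= 0.0, 1, -1)
    breaks = np.flatnonzero(np.diff(sign)) + 1
    runs = np.split(np.arange(grid.size), breaks)
    peaks = np.array([run[np.argmax(np.abs(err[run]))] for run in runs])
    if peaks.size < N:
        raise ValueError("fewer than N alternations found on the grid")
    g = int(np.argmax(np.abs(err[peaks])))
    starts = range(max(0, g - N + 1), min(g, peaks.size - N) + 1)
    best = max(starts, key=lambda s: np.min(np.abs(err[peaks[s:s + N]])))
    return grid[peaks[best:best + N]]

def exchange_iteration(f, a, b, m, steps=10, grid_size=4001):
    N = m + 2
    grid = np.linspace(a, b, grid_size)
    xs = np.linspace(a, b, N)                    # any N distinct points start the process
    history = []
    for _ in range(steps):
        coeffs, y = leveled_fit(f, xs, m)
        err = np.polyval(coeffs, grid) - f(grid)
        history.append((abs(y), float(np.max(np.abs(err)))))
        xs = admissible_extrema(err, grid, N)
    return coeffs, history

# Hypothetical run: quadratic approximation to exp on [-1, 1].
coeffs, history = exchange_iteration(np.exp, -1.0, 1.0, m=2)
```

Each entry of `history` pairs the level $|y_j|$ with the largest error found on the grid; the first sequence increases and the second bounds it from above, and the two meet when the iteration has settled, in line with Lemma 2 and Theorem A.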


§  5.  COMPARISONS  WITH  OTHER  METHODS 

The method of Remez, represented in some fashion in [4], is described
in English in [6] for approximations of the form $s(x) P(x)$, where $s$ is a fixed
continuous function with no zeros in $I$, and $P$ is a polynomial. Proofs, said to be given
in [4], are omitted from [6]. The latter reference is the only one I have been
able to read to date.

We now describe Remez's method in our terminology:

The method begins by the choice of the initial approximation $P_0$ as described
in § 4: For $x_1 < \ldots < x_N \in I$, we determine $P_0$ by the $N$ equations $P_0(x_i) - f(x_i) =
(-1)^i y_0$. Step 2. Let $x'$ be a point of $I$ at which $E_0 = P_0 - f$ takes the value $\pm \|E_0\|$.
Replace one of the $x_i$ by $x'$, calling the resulting points $x_{11} < \ldots < x_{1N}$, in
such a way that they are an admissible set of extrema of $E_0$. (Cf. Lemma 1.) Now
determine $P_1$ by solving the $N$ equations $P_1(x_{1i}) - f(x_{1i}) = (-1)^i y_1$, $i = 1, \ldots, N$.
Find $P_2$ by replacing one of the $x_{1i}$ by $x''$ such that $E_1(x'') = \pm \|E_1\|$, and so on.

It is obvious that the conclusions of Theorem 1 hold here also, and that the $y_j$'s
of Remez increase monotonically (in magnitude) to $d$. The rest of our proof of
convergence clearly carries over to this process without essential change. The presence
of the function $s$ would complicate the proofs in no essential way; $s$ was omitted
from the present report chiefly for reasons of clarity.
As a practical matter, the present method should converge faster than that of
Remez, since the $|y_j|$ are larger than the corresponding quantities in Remez's method.
This quicker convergence is paid for by a more complicated choice of the admissible
extrema $x_i$ at each stage, however, except for the case when there are always
exactly $N$ admissible extrema. In this important special case, there is no more
difficulty in carrying out the present method than there is in doing that for Remez's;
for in order to find $\|E_j\|$ one must find all extreme values of $E_j$.

If the computation of $f(x)$ were difficult, then the method of the present report
might well be preferable to that of Remez.

I was led to the method of this report partly by ruminating on Hastings's method,
[2], as defined by P. W. Ketchum in Mathematical Reviews [3] (Hastings's book [2]
suffers from a certain lack of definition). For me, the central point of both Hastings's
method and my own is the "iterative assumption" of stability of the extrema of $E_j$.
This point plus Theorem B led me to my method. I then noticed the following
comparison: Hastings linearizes the $N$ non-linear equations in step 2° by assigning a
numerical value to $y_j$, thus reducing the number of unknowns to $N - 1$; he solves $N - 1$
equations and hopes for a correct value at the $N$-th extremum. I linearize these equations
by replacing the unknown coefficient of $y_j$ by a known number, thus preserving the
number of unknowns.


§ 6. EXPERIMENTAL DATA

The so-called "linear algorithm" of § 3 for finding rational approximations has
been programmed, with $m = n = 2$, for the Cambridge Computer. The following are the
functions $f$ approximated, the interval $I$, the value of $\|E_k\|$ obtained, the smallest
maximum value of $|E_k|$, and the number $k$ of times Step 2° was performed in order
to obtain the final approximation:$^{5)}$

5) I wish to acknowledge most gratefully the kind assistance of Miss Helen Willett in
obtaining these results from the Cambridge Computer.
f       I             ||E_k||         min max |E_k|    k

exp     [-1, 1]       8.71 x 10^-5    8.66 x 10^-5     4

log     [1, 2]        1.75 x 10^-6    1.68 x 10^-6     5

sine    [0.6, 7.0]    0.266           0.263            13

In each of the above three "experiments", there were exactly $6 = N$ extrema of
$E_j$ for each $j$; the extreme values always alternated properly in sign. These extrema
were maxima of $|E_j|$. Two extrema were always at the end-points of $I$.

The minimum magnitude of the extreme values is shown in order to provide an
idea of the closeness of the last approximation $R_k$ to the best approximation $R$. In the
cases exp and log, these numbers and those under the heading $\|E_k\|$ are probably not
accurate to the three significant figures presented, since they are the last three
significant figures of the eight available on the Cambridge Computer.

There follow the values of the coefficients of the approximations $R_k$ discussed
above. In each case $R_k$ has the form

$$R_k(x) = (a_2 x^2 + a_1 x + a_0) / (x^2 + b_1 x + b_0) .$$


        a_2          a_1          a_0          b_1          b_0

exp     1.1045366    6.5455741    12.869739    -6.3197327   12.868806

log     3.3615004    1.9750524    -5.3365379   5.6992101    1.9999023

sine    .88630520    -8.4079627   18.363517    -9.0106829   21.528251
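
The tabulated coefficients can be checked by evaluating the rational functions on a fine grid and comparing with the library functions. The sketch below (Python with NumPy) does this for the exp and sine rows, on the intervals [-1, 1] and [0.6, 7.0] read from the table of results; the function name `max_error` and the grid size are assumptions.

```python
import numpy as np

def max_error(num, den, f, a, b, samples=20001):
    """Largest |R(x) - f(x)| over an equally spaced grid on [a, b],
    where R = num/den and coefficients run from highest power to lowest."""
    x = np.linspace(a, b, samples)
    r = np.polyval(num, x) / np.polyval(den, x)
    return float(np.max(np.abs(r - f(x))))

# Coefficients (a_2, a_1, a_0) and (1, b_1, b_0) as tabulated.
err_exp = max_error([1.1045366, 6.5455741, 12.869739],
                    [1.0, -6.3197327, 12.868806], np.exp, -1.0, 1.0)
err_sin = max_error([0.88630520, -8.4079627, 18.363517],
                    [1.0, -9.0106829, 21.528251], np.sin, 0.6, 7.0)
```

A grid maximum is only a lower estimate of the true maximum error, so small differences from the tabulated values of $\|E_k\|$ are to be expected.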

Some tests of significance of digits in the coefficients of the approximation to
exp were made. It was found that rounding the coefficients to seven significant figures
produced a spread of 0.25 x 10^-5 between minimum and maximum extreme values,
whereas rounding to six decimal places produced a spread of 0.03 x 10^-5, the
maximum being 8.701 x 10^-5, the minimum 8.67 x 10^-5. These numbers of course are
potentially in error because of round-off in the machine, but they indicate that the
last-mentioned rounding produces a result just as good as the original.

A brief description of the program follows. Letting $h$ denote $(b - a)/20$, the
computer finds $R_0$ by interpolating to $f$ at the five equally spaced points $x_1 < \ldots < x_5$,
where $x_1 = a + h$ and $x_5 = b - h$. The extrema of $E_0$ are then found: At every stage
the program assumes that $a$ and $b$ are extrema of $E_j$; the interior extrema are found
by solving $E_j' = 0$ by the method of regula falsi. The equations of step 2° are then
solved; and the process is repeated. The criterion for stopping at $j = k$ is that the
extrema of $E_k$ be not too different from the corresponding extrema of $E_{k-1}$, which are
stored at each stage. The coefficients $a_i$ and $b_i$ of each $R_j$ are printed, and at the end
the values of the extrema of $E_k$ and the corresponding extreme values are printed.
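
For readers unfamiliar with it, regula falsi (false position) finds a zero of a function that changes sign on an interval by repeatedly replacing one end of the bracket with the zero of the chord. The following Python sketch is a generic illustration, not the routine used on the Cambridge Computer; the function name, tolerance, and the sample error function are assumptions.

```python
import math

def regula_falsi(g, lo, hi, tol=1e-12, max_iter=200):
    """Zero of g in [lo, hi] by false position; g(lo) and g(hi) must
    not have the same sign."""
    g_lo, g_hi = g(lo), g(hi)
    if g_lo * g_hi > 0:
        raise ValueError("g must change sign on [lo, hi]")
    x = lo
    for _ in range(max_iter):
        x = hi - g_hi * (hi - lo) / (g_hi - g_lo)   # zero of the chord
        g_x = g(x)
        if abs(g_x) < tol:
            break
        if g_lo * g_x < 0:                          # zero lies in [lo, x]
            hi, g_hi = x, g_x
        else:                                       # zero lies in [x, hi]
            lo, g_lo = x, g_x
    return x

# Hypothetical error function E(x) = sin x - x/2; its interior extremum
# on [0.5, 1.5] is the zero of E'(x) = cos x - 1/2.
x_ext = regula_falsi(lambda x: math.cos(x) - 0.5, 0.5, 1.5)
```

The bracket always contains the zero, which is why the method suits a program that must locate one extremum between each pair of neighbouring sign changes of $E_j'$.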

The arbitrariness of the above procedure for finding $R_0$ can lead to difficulty. In
particular, for $f$ = sine, it gave an $R_0$ having poles in $I$ for $I = [0.1, 6.8]$ and
$I = [0.5, 6.9]$. Up to now, the method has "converged", however, so long as $R_0$
had no poles in $I$. As now programmed, however, it would probably fail if some $E_j$
had more than $N = 6$ extrema, or if $a$ (or $b$) were a minimum of $|E_j|$ rather than a
maximum.

At  no  point,  except  in  the  non-essential  final  print-out,  is  it  necessary  in  this 
method  to  compute  values  of  Ej. 
APPENDIX


(Added "In Proof")

Now that I have seen the paper [4] in translation, let me describe it a little
more fully. This paper proves the convergence of a process similar to but more
general than my own polynomial algorithm. Specifically, the Russian authors
consider a class Q of functions A which can be thought of as a generalization of {P - f} for
a given continuous f and all polynomials P of degree at most m. The sequence
A_0, A_1, ... is constructed by equating the values of A_k to ... at any n + 2
points satisfying certain properties which are more general than those which I required.

The proofs in [4] are quite different in execution from mine, but the general
similarity of direction is readily apparent.

It appears that the method of [4] does not apply to the class of rational
approximations which I discussed.